Review for NeurIPS paper: Audeo: Audio Generation for a Silent Performance Video
Summary and Contributions: This paper proposes a novel pipeline approach that improves piano music/audio generation from silent videos showing a top view of a pianist's fingers playing on a keyboard. Prior work [27] used an end-to-end approach that predicts a symbolic piano performance directly from video using ResNets. This paper points out that the video and music/audio streams are substantially mismatched, and hence that the processing requires multiple stages of transformation. The proposed pipeline consists of three interpretable components / stages. Video2Roll consists of three stages.